Methods and systems for assessing excessive accessory listings in search results

ABSTRACT

A system and method for generating a score for a listing are described. A data field of a listing is parsed and at least one element from the data field is generated. A score for the listing is calculated based on the at least one element, the listing score representing a probability of the listing being in one of two binary classifications. The listing score and one or more listing attribute values are inputted into a binary classifier. An output is generated using the binary classifier, the output representing a refined score for the listing based on the listing score and at least one of the listing attribute values.

PRIORITY

This application is a continuation of and claims the benefit of priority to U.S. Patent Application Serial No. 13/082,226, filed on April 7, 2011, which is hereby incorporated by reference herein in its entirety.

TECHNICAL FIELD

Example embodiments of the present disclosure generally relate to methods and systems for assessing excessive accessory listings in search results.

BACKGROUND

A common feature of websites that facilitate electronic commerce is the classification and categorization of goods, services, or other assets (collectively, “items”) offered for sale. Classification of goods and services requires a balancing act between segmenting items into relevant categories to make the transacting experience efficient and user-friendly and ensuring that the categories remain broad enough that potential buyers and sellers can easily categorize or find items. In certain instances, however, items may be miscategorized, or certain types of items may pollute or overwhelm the listings offered in a category.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, which are not necessarily drawn to scale, like numerals describe substantially similar components throughout the several views. The drawings illustrate generally, by way of example, but not by way of limitation, various embodiments discussed in the present document.

FIG. 1 is a network diagram depicting a network system, according to example embodiments, having a client-server architecture configured for exchanging data over a network.

FIG. 2A is a block diagram illustrating multiple publication applications, which may be provided as part of a network-based publisher, in accordance with example embodiments.

FIG. 2B is a block diagram illustrating various modules of an analysis application, in accordance with example embodiments.

FIG. 2C is a block diagram illustrating modules of an analysis application, in accordance with example embodiments.

FIG. 3 is a flow chart illustrating an example method for classifying items.

FIG. 4 is a flow chart illustrating an example method for ranking search results based on an intention of a user query.

FIG. 5 shows a diagrammatic representation of a machine in the example form of a computer system within which a set of instructions may be executed to cause the machine to perform any one or more of the methodologies discussed herein.

DETAILED DESCRIPTION

Although the present disclosure has been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the present disclosure. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

In various embodiments, a system, method, and machine-readable storage medium storing a set of instructions for assessing excessive accessory listings in search results are disclosed. A processor-implemented textual mining module may parse a data field of a document and generate at least one token from the data field. A token may represent one or more words, symbols, or numbers extracted from the data field. A processor-implemented scoring module may calculate a score for the at least one token, with the at least one token score representing a likelihood that the at least one token belongs to one of two binary classifications. The processor-implemented scoring module also may calculate a score for the document based on the at least one token score, with the document score representing a probability of the document being in one of the two binary classifications. A processor-implemented decision tree module may input the document score and document attribute values into a decision tree and generate an output representing a refined score based on the document score and at least one of the document attribute values.

FIG. 1 is a network diagram depicting a network system 100, according to one embodiment, having a client-server architecture configured for exchanging data over a network. For example, the network system 100 may be a publication/publisher system 102 where clients may communicate and exchange data within the network system 100. The data may pertain to various functions (e.g., selling and purchasing of items) and aspects (e.g., data describing items listed on the publication/publisher system) associated with the network system 100 and its users. Although illustrated herein as a client-server architecture as an example, other example embodiments may include other network architectures, such as a peer-to-peer or distributed network environment.

A data exchange platform, in an example form of a network-based publisher 102, may provide server-side functionality, via a network 104 (e.g., the Internet), to one or more clients. The one or more clients may include users that utilize the network system 100 and, more specifically, the network-based publisher 102, to exchange data over the network 104. These transactions may include transmitting, receiving (communicating) and processing data to, from, and regarding content and users of the network system 100. The data may include, but are not limited to, content and user data such as feedback data; user reputation values; user profiles; user attributes; product and service reviews; product, service, manufacturer, and vendor recommendations and identifiers; product and service listings associated with buyers and sellers; auction bids; and transaction data, among other things.

In various embodiments, the data exchanges within the network system 100 may be dependent upon user-selected functions available through one or more client or user interfaces (UIs). The UIs may be associated with a client machine, such as a client machine 106 using a web client 110. The web client 110 may be in communication with the network-based publisher 102 via a web server 120. The UIs may also be associated with a client machine 108 using a programmatic client 112, such as a client application, or a third party server 114 hosting a third party application 116. It can be appreciated that, in various embodiments, the client machine 106, 108, or third party application 116 may be associated with a buyer, a seller, a third party electronic commerce platform, a payment service provider, or a shipping service provider, each in communication with the network-based publisher 102 and optionally each other. The buyers and sellers may be any one of individuals, merchants, or service providers, among other things.

Turning specifically to the network-based publisher 102, an application program interface (API) server 118 and a web server 120 are coupled to, and provide programmatic and web interfaces respectively to, one or more application servers 122. The application servers 122 host one or more publication application(s) 124. The application servers 122 are, in turn, shown to be coupled to one or more database server(s) 126 that facilitate access to one or more database(s) 128.

In one embodiment, the web server 120 and the API server 118 communicate and receive data pertaining to listings, transactions, and feedback, among other things, via various user input tools. For example, the web server 120 may send and receive data to and from a toolbar or webpage on a browser application (e.g., web client 110) operating on a client machine (e.g., client machine 106). The API server 118 may send and receive data to and from an application (e.g., client application 112 or third party application 116) running on another client machine (e.g., client machine 108 or third party server 114).

The publication application(s) 124 may provide a number of publisher functions and services (e.g., search, listing, payment, etc.) to users that access the network-based publisher 102. For example, the publication application(s) 124 may provide a number of services and functions to users for listing goods and/or services for sale, searching for goods and services, facilitating transactions, and reviewing and providing feedback about transactions and associated users. Additionally, the publication application(s) 124 may track and store data and metadata relating to listings, transactions, and user interactions with the network-based publisher 102.

FIG. 1 also illustrates a third party application 116 that may execute on a third party server 114 and may have programmatic access to the network-based publisher 102 via the programmatic interface provided by the API server 118. For example, the third party application 116 may use information retrieved from the network-based publisher 102 to support one or more features or functions on a website hosted by the third party. The third party website may, for example, provide one or more listing, feedback, publisher or payment functions that are supported by the relevant applications of the network-based publisher 102.

FIG. 2A is a block diagram illustrating an example embodiment of multiple publication application(s) 124, which are provided as part of the network-based publisher 102. The network-based publisher 102 may provide a multitude of listing and price-setting mechanisms whereby a user may be a seller or buyer who lists or buys goods and/or services (e.g., for sale) published on the network-based publisher 102. The network-based publisher 102 also may provide search mechanisms by which a buyer or a seller may submit queries to search among existing listings.

The publication application(s) 124 are shown to include, among other things, one or more application(s) which support the network-based publisher 102, and more specifically, the listing of goods and/or services for sale, the searching of existing listings, and the analysis of existing listings.

Store application(s) 202 permit sellers to list individual goods and/or services (hereinafter generically referred to as “items”) for sale via the network-based publisher or group their listings within a “virtual” store, which may be branded and otherwise personalized by and for the sellers. Individual and grouped listings may include details such as a title of an item offered for sale, a description of the item, a price of the item, one or more images of the item, a geographic location of the seller or the item, payment and shipping options, and a return policy. The virtual store also may offer promotions, incentives and features that are specific and personalized to a relevant seller.

Listing application(s) 204 provide mechanisms to enable sellers to list individual items for sale via the network-based publisher. Listing application(s) 204 may operate in cooperation with store application(s) 202 to facilitate the listing of items. Listing application(s) 204 may provide listing templates that guide a seller through a step-by-step process for listing an item for sale. In some example embodiments, listing application(s) 204 include additional functionality for streamlining or simplifying the listing process. In some example embodiments, listing application(s) 204 may include mobile device applications that enable a user to scan a barcode of an item to populate a listing template used to generate a listing of an item to be sold. Mobile device applications also may enable a user to capture an image of an item to initiate a listing generation process.

Search application(s) 206 may provide search user interfaces and functionality to enable users to identify listings matching submitted search queries. The search interfaces may support a multitude of search methodologies, such as keyword searches, category searches, item identifier searches, user searches, title searches, and so forth. In response to a search query, the search application(s) 206 may identify items responsive to the terms included in the query. In some example embodiments, the search application(s) 206 may rank or sort the returned search results according to a hierarchy. For example, search results may be ranked by relevancy, best match, newest to oldest, price, and so forth.

Analysis application(s) 208 may include functionality to enable the network-based publisher 102 to analyze existing item listings. The analysis may be performed to optimize the performance of the network-based publisher 102, such as by improving the classification of item listings or by improving the performance of search query processing, as well as to provide metrics and statistics to buyers and sellers interacting with the network-based publisher 102. For example, analysis application(s) 208 may analyze item listings to determine whether items are properly classified in categories. In some embodiments, the analysis application(s) 208 may analyze listings to determine whether a listing concerns a product or an accessory. In some embodiments, the analysis application(s) 208 may analyze listings to determine whether an item listing is categorized in the correct category. In an example embodiment, the analysis application(s) 208 also may analyze search queries submitted by users to identify a search intent of a user. In some embodiments, the search intent of a user may be inferred by identifying categories of item listings that are responsive to the search query and determining which of the identified categories are dominant categories. Dominant categories (e.g., most visited categories, categories containing the most converted listings, that is, categories which, when returned, resulted in a purchase or sale of an item) may be more likely to include item listings that satisfy or represent the search intent of the user.

FIG. 2B is a block diagram illustrating an example embodiment of a classifier module 210 which, in some embodiments, may include a textual mining module 212, a scoring module 214, and a decision tree module 216. The classifier module 210 or any of its sub-modules may be used by any of the modules, systems, and methods disclosed herein to analyze item listings and enhance the processing of search queries. The modules disclosed herein may be implemented or executed by one or more processors, or may configure one or more processors to perform the functionality of the modules disclosed herein.

A textual mining module 212 may parse fields of an item listing to identify a type or category of a product. In some embodiments, the textual mining module 212 determines whether an item listing is more likely a product or a product accessory. In other embodiments, the textual mining module 212 may parse the fields of a document generally to perform a binary classification of the document. For example, the textual mining module 212 may be used to parse one or more fields of a document to determine whether an email is or is not spam, that is, unwanted and/or unsolicited email. In some embodiments, the textual mining module 212 parses the words of a title field of an item listing into tokens. By tokenizing the title field of an item listing, the textual mining module 212 may enable analysis of the item listing to be performed by analyzing each word of one or more fields of the item listing.
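As a minimal sketch of the tokenizing step described above, the following Python fragment splits a title field into word and number tokens. The function name `tokenize_title`, the lowercasing, and the choice to discard punctuation are assumptions made for illustration; the disclosure does not prescribe a particular tokenization rule.

```python
import re


def tokenize_title(title: str) -> list[str]:
    """Split a listing title into lowercase word/number tokens.

    Hypothetical helper: punctuation is treated as a separator and dropped.
    """
    return re.findall(r"[a-z0-9]+", title.lower())


tokens = tokenize_title("Leather Case for iPhone 4 - Black")
print(tokens)
```

Each resulting token can then be scored individually, as the following paragraphs describe.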

A scoring module 214 may generate a score for each item listing based on the tokens generated by the textual mining module 212. The scoring module 214 may assign a score to each tokenized word that indicates whether the word is more likely to be in a first binary classification or a second binary classification. For example, in the context of a product or product accessory determination, the scoring module 214 assigns a score to each tokenized word that reflects whether the word is more likely to be a word associated with a product or a product accessory. In some embodiments, the score assigned to each word may be a positive or negative score. In other embodiments, the score may be a ‘1’ or a ‘0’ or other binary scoring system (e.g., “Yes” or “No”, “Product” or “Not Product”) with one value indicating that the word is more likely associated with a product and the other value indicating that the word is more likely associated with a product accessory.

The scoring module 214 may use or implement a naive Bayes classifier when generating a score for an item listing. The naive Bayes classifier may be applied to calculate the probability that a document falls exclusively within a first or a second binary classification (e.g., a product or a product accessory). For example, the probability that a document (D) may be a product accessory (S) may be represented as p(S|D), while the probability that a document (D) may be a product (i.e., not a product accessory or −S) may be represented as p(−S|D). For each word, the naive Bayes classifier may represent the probability of the word (w_i) being within the accessories class as p(w_i|S). The naive Bayes classifier may use the equation below to determine a likelihood or probability ratio of a document being an accessory to a document not being an accessory.

${\ln \frac{p( {SD} )}{p( {{- S}d} )}} = {{\ln \frac{p(S)}{p( {- S} )}} + {\sum\limits_{i}{\ln \frac{p( {w_{i}S} )}{p( {w_{i}{- S}} )}}}}$

From this equation, a document may be classified as a product accessory if the probability of the document being a product accessory is greater than the probability of the document not being a product accessory, or p(S|D)>p(−S|D).
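The log-ratio above can be evaluated directly once the prior and the per-word probabilities are known. The sketch below is illustrative only: the function name, the probability tables, and the small `floor` value used for words missing from a table are assumptions, not part of the disclosure.

```python
import math


def log_odds_accessory(tokens, prior_s, p_w_given_s, p_w_given_not_s, floor=1e-6):
    """Return ln[p(S|D) / p(-S|D)] under the naive Bayes independence assumption.

    prior_s is p(S); the two dicts map a word to p(w|S) and p(w|-S).
    Words missing from a table fall back to `floor` (an assumed smoothing value).
    """
    log_ratio = math.log(prior_s / (1.0 - prior_s))
    for word in tokens:
        p_s = p_w_given_s.get(word, floor)
        p_not_s = p_w_given_not_s.get(word, floor)
        log_ratio += math.log(p_s / p_not_s)
    return log_ratio


# Made-up probabilities for illustration.
p_s = {"charger": 0.08, "phone": 0.05}
p_not_s = {"charger": 0.02, "phone": 0.05}
ratio = log_odds_accessory(["phone", "charger"], 0.5, p_s, p_not_s)
is_accessory = ratio > 0  # equivalent to p(S|D) > p(-S|D)
```

With equal priors and "phone" equally likely in both classes, only "charger" contributes, so the log-ratio here equals ln(0.08/0.02) = ln 4 and the document is classified as an accessory.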

In some embodiments, a test set of documents is analyzed to teach a machine learning algorithm how to classify documents as being related to either product accessories or products. The textual mining module 212 tokenizes each word in the title field of each document, and the scoring module 214 performs a scoring of the word using the naive Bayes classifier. For each word occurrence, the number of accessory versus non-accessory determinations is calculated to form a ratio of product accessory determinations to product determinations for each word. For example, if the word “charger” appears multiple times in item listings, it may be determined that in x instances, the word “charger” refers to a product accessory, while in y instances, the word “charger” refers to a product itself. The ratio of x to y may be calculated to determine how likely it is that the word “charger” generally refers to a product accessory.

In an example embodiment, using the equation above as applied to the tokenized words from a sample set of documents, a ratio of the probability of a document being a product accessory to the probability of a document not being a product accessory may be determined. In one embodiment, the ratio R may be determined to be approximately 2.3, meaning that for a document or item listing to be classified as a product accessory, the words in the title need to have positive ratios and the ratio should exceed approximately 2.3. In other embodiments, the ratio R may be dependent on the sets of documents used to train the machine learning system for classification purposes.

The scoring module 214 may combine the scores for individual words in the item listing fields to generate an item score. Each individual tokenized word may be given a score based on the probability calculated by the naive Bayes classifier or based on the ratio R described above. The score may be normalized to within a range of ‘0’ to ‘1’. For example, if a word has 8594 occurrences in an accessory and 5600 occurrences in a product, the score assigned to the word may be 0.605467 (i.e., 8594 accessory occurrences divided by 14194 samples (the total number of occurrences)). In some embodiments, the scoring module 214 may normalize this score to 0.6, while in other embodiments, the scoring module 214 may normalize this score to 1 to indicate its propensity to be associated with product accessories. A document or item listing's score may be calculated by summing the scores for each tokenized word and dividing by the total number of words.
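The arithmetic in the preceding paragraph can be sketched as follows. The function names and the count table are hypothetical, and assigning a neutral 0.5 to a word with no recorded occurrences is an assumption; the disclosure does not say how unseen words are handled.

```python
def word_score(accessory_count: int, product_count: int) -> float:
    """Fraction of a word's occurrences that fall in accessory listings."""
    total = accessory_count + product_count
    # Assumption: a word never seen in the sample set is treated as neutral.
    return accessory_count / total if total else 0.5


def document_score(tokens, counts) -> float:
    """Average the word scores over all tokens of the listing field."""
    if not tokens:
        return 0.5  # assumption: an empty field carries no evidence
    scores = [word_score(*counts.get(tok, (0, 0))) for tok in tokens]
    return sum(scores) / len(scores)


# (accessory occurrences, product occurrences); the "usb" counts are made up.
counts = {"charger": (8594, 5600), "usb": (3, 1)}
charger = word_score(8594, 5600)                   # 8594 / 14194
title = document_score(["usb", "charger"], counts)
```

Here `charger` reproduces the 0.605467 figure from the example, and `title` is the mean of the two word scores.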

A decision tree module 216 may enhance the score of a document generated by the scoring module 214. Because the document is scored by the scoring module 214 solely on the words of a field (e.g., the title) of a document, the accuracy of the score may be improved upon. The decision tree module 216 combines the score generated by the scoring module 214 with various document attributes to generate a refined score for each document. For embodiments relating to electronic commerce, where documents are item listings offering items for sale, the document attributes that may be used include, but are not limited to, a starting price of an item, shipping methods and prices, product sales ranks, insurance options, a feedback score of a seller, a country in which the item is located, an item sub-title, payment types, a quantity of the item, a return policy for the item, historical traffic and demand information, and a median price calculated to predict the price at which the item should be listed. The decision tree module 216 may input some or all of these attributes, along with the score generated by the scoring module 214, into one or more decision trees that output results using a combination of the score generated by the scoring module 214 and any or all of the attributes discussed above. The results of the decision tree reflect a final score or decision given to the item as to whether the item is considered to be a product accessory or product. In some embodiments, a bit or indicator reflecting an output of the decision tree may be stored with each item listing. The bit or indicator may denote whether the item listing is a product or a product accessory.
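To make the refinement step concrete, the sketch below walks a tiny hand-built decision tree over the text score and two of the listed attributes (starting price and median price). Every threshold and leaf value is invented for illustration, and the class and function names are hypothetical; a deployed tree would presumably be learned from labeled listings and could use any of the attributes above.

```python
from dataclasses import dataclass


@dataclass
class ListingFeatures:
    text_score: float    # score from the title tokens, in [0, 1]
    start_price: float   # starting price of the item
    median_price: float  # median price predicted for the item


def refine_score(f: ListingFeatures) -> float:
    """Return a refined accessory score from a two-level tree (made-up values)."""
    price_ratio = f.start_price / f.median_price if f.median_price > 0 else 1.0
    if f.text_score >= 0.6:
        # Strong textual signal; a price far below the median reinforces it.
        return 0.95 if price_ratio < 0.25 else 0.75
    # Weak textual signal; only a very low price suggests an accessory.
    return 0.65 if price_ratio < 0.10 else 0.10


def accessory_bit(f: ListingFeatures, threshold: float = 0.5) -> bool:
    """Indicator that could be stored with the listing (True = accessory)."""
    return refine_score(f) >= threshold


cheap_case = ListingFeatures(text_score=0.7, start_price=5.0, median_price=200.0)
handset = ListingFeatures(text_score=0.3, start_price=180.0, median_price=200.0)
```

In this sketch `cheap_case` lands in the highest-scoring leaf and is flagged as an accessory, while `handset` is not.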

FIG. 2C is a block diagram illustrating an example embodiment of a search engine optimization module 218 which, in some embodiments, may include a query interpreter module 220 and a ranking module 222. The search engine optimization module 218 may be used by any of the modules, systems, and methods disclosed herein to analyze item listings and enhance the processing of search queries. The modules disclosed herein may be implemented or executed by one or more processors, or may configure one or more processors to perform the functionality of the modules disclosed herein.

In some embodiments, the query interpreter module 220 receives a search query from the search application(s) 206. In some embodiments, the search query may be a query submitted by a user for an item offered for sale via the network-based publisher 102. The query interpreter module 220 may parse the search query and identify one or more dominant categories related to the search query. In some embodiments, the query interpreter module 220 may identify one or more most visited categories related to the search query or one or more most converted listing categories (i.e., categories which, when returned, resulted in a purchase or sale of an item) related to the search query. Identification of the most relevant categories may be aided by reference to historical data associated with each category. For example, historical data may show that a category overwhelmingly stores product accessories associated with the keywords included in the search query. In this case, the category may be deemed to be a product accessory category. By identifying the dominant categories relating to the search query, the query interpreter module 220 may be able to determine an intent of the query, namely in one example embodiment, whether the query is seeking a product or a product accessory.
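One way to picture the use of historical data is to tally, for the keywords in a query, which categories past conversions fell into. The sketch below assumes a hypothetical log of (keyword, category) pairs; the log format, the category names, and the function name are all illustrative rather than taken from the disclosure.

```python
from collections import Counter


def dominant_categories(query_tokens, conversion_log, top_n=1):
    """Return the categories with the most past conversions for the query words.

    conversion_log is an assumed list of (keyword, category) pairs, one per
    historical purchase that followed a search containing that keyword.
    """
    wanted = set(query_tokens)
    tally = Counter(cat for kw, cat in conversion_log if kw in wanted)
    return [cat for cat, _ in tally.most_common(top_n)]


log = [
    ("charger", "Phone Accessories"),
    ("charger", "Phone Accessories"),
    ("charger", "Cell Phones"),
    ("laptop", "Computers"),
]
top = dominant_categories(["iphone", "charger"], log)
```

With this made-up log, the dominant category for "iphone charger" is the accessory category, which would suggest that the query seeks a product accessory.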

The ranking module 222 receives the query intent determination from the query interpreter module 220 and retrieves search results from one or more databases that satisfy the query intent determination. For example, if the query interpreter module 220 determines that a search query seeks a product accessory, the ranking module 222 may receive this determination and may retrieve stored product accessory listings from the categories identified as relevant by the query interpreter module 220. The item listings may be retrieved by performing a keyword search of the one or more databases storing item listings using the words included in the search query. In some embodiments, items stored in the database(s) may be determined to be products or product accessories based on a bit or indicator included in the item listing record that represents whether the item listing is a product or a product accessory. In some embodiments, the categories and sub-categories of item listings may be separated into product and product accessory categories to facilitate efficient retrieval of products or product accessories. In some embodiments, as the item listings are retrieved from the database(s), the ranking module 222 may rank or order the retrieved item listings for presentation to the user.

In some embodiments, item listings may be ranked by the score generated by the scoring module 214 and decision tree module 216. For example, for a search returning item listings that are product accessories, item listings that are scored with greater probability of being a product accessory in a given category may be presented with a higher order or rank than item listings having a lower probability of being a product accessory. In some embodiments, item listings may be ranked or ordered by other metrics or attributes, such as price, seller reputation, time remaining (if the item listing pertains to an auction), and so forth.
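A short sketch of score-based ordering follows. The dictionary keys and the function name are hypothetical. For an accessory search, higher accessory scores are placed first, as described above; reversing the order for a product search is an assumption added for symmetry.

```python
def rank_listings(listings, seeking_accessory=True):
    """Order listings by refined accessory score.

    Accessory searches put the highest scores first; product searches
    (an assumed extension) put the lowest scores first.
    """
    return sorted(
        listings,
        key=lambda item: item["refined_score"],
        reverse=seeking_accessory,
    )


results = [
    {"id": "A", "refined_score": 0.10},
    {"id": "B", "refined_score": 0.95},
    {"id": "C", "refined_score": 0.65},
]
accessory_order = [item["id"] for item in rank_listings(results)]
product_order = [item["id"] for item in rank_listings(results, seeking_accessory=False)]
```

Other attributes such as price or time remaining could be added as secondary sort keys by returning a tuple from the key function.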

FIG. 3 is a flow chart illustrating an example method for classifying items. At operation 302, one or more data fields of documents or item listings are mined. Data mining operations may include parsing the field(s) and tokenizing the words contained in the fields. At operation 304, a naive Bayes classifier may be employed by a scoring module 214 to classify documents or item listings as falling within one of two binary classifications. In some embodiments, the naive Bayes classifier may classify documents or item listings as either product accessories or products. Classification of documents or item listings is accomplished by generating a score for each tokenized word from the field(s) of the document or item listing being analyzed. The score may be generated from a sample set of item listings used to train a machine learning algorithm to classify item listings. For each word in an item listing field (e.g., title), the number of occurrences of that word in an accessory item listing is compared to the number of occurrences of that word in a product item listing. The occurrences are tallied and a ratio of number of accessory occurrences to number of product occurrences for the word is calculated. The ratio may be used to determine the probability or likelihood that the word is indicative of a product or a product accessory. The ratio also may be converted into a word score, which in some embodiments, may involve normalizing the ratio within a range of 0 to 1.

The scores for the words in each item listing field (e.g., title) may be combined by summing the scores for the words and dividing by the number of total samples. For example, each score may be a normalized ratio of the number of accessory occurrences to the number of product occurrences, with each occurrence being considered a sample. The score may reflect a likelihood or probability that the item listing is a product accessory or a product. For example, the closer the normalized score is to 1, the more likely the item listing is a product accessory.

At operation 306, the score generated for the item listing may be combined with additional item attributes or factors. At operation 308, the score and additional attributes may be input into a decision tree that may use the inputs to generate a refined score for an item listing. The score may be refined through the use of the additional metrics, since the generation of the score in operation 304 may rely only on the text contained in the field(s) (e.g., title) of the item listing. Additional metrics that may further enhance the score of the item listing may include the price of the item, the shipping cost of the item, the return policy of the item, the product sales rank, the available quantity of the item, the location of the item, and various seller-related information (e.g., seller reputation, seller volume).

At operation 310, the decision tree may output a refined score that provides a more comprehensive determination of the likelihood that an item listing is either a product accessory or a product. In some embodiments, the score or the classification derived from the score may be stored with the item listing to inform other components of the system (e.g., the search engine, classification modules) that the item listing is related to a product or a product accessory.

FIG. 4 is a flow chart illustrating an example method for ranking search results based on an intention of a user query. At operation 402, a search query received from a user may be parsed. In some embodiments, the words comprising the search query may be tokenized for use in processing the search query. At operation 404, using the words of the search query, dominant categories may be identified. Dominant categories may represent categories of item listings that are highly relevant to the submitted search queries. For example, dominant categories may be identified by searching stored categories with keywords derived from the search query. Categories that return a large number of results may be considered dominant categories. In addition to or instead of dominant categories, most visited categories or most converted listing categories (i.e., categories that yielded a conversion, such as a sale, from a search) may be returned.

At operation 406, the returned categories may be combined with item listing determinations within the returned categories. In some embodiments, item listings contained within the returned categories may be grouped or organized into two classes corresponding to the classes used to separate item listings into binary classifications (e.g., products and product accessories). Using the keywords contained in the search query, it may be determined in which of the two binary classifications the user is searching for item listings. For example, if the user's query includes any of the words “charger,” “for,” “case,” or “cover,” a machine learning classifier may determine that the user is searching for a product accessory. In this case, item listings of the returned categories that are grouped in the product accessory category may be returned.
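The keyword example above can be approximated with a simple lookup. Note that this is a stand-in, not the machine learning classifier the paragraph mentions: it checks only the four cue words given in the example, and the set name and function name are hypothetical.

```python
# Cue words taken from the example above; a trained classifier would
# presumably learn a much larger vocabulary with weights.
ACCESSORY_CUES = {"charger", "for", "case", "cover"}


def query_seeks_accessory(query: str) -> bool:
    """Return True if any query word is one of the accessory cue words."""
    return any(tok in ACCESSORY_CUES for tok in query.lower().split())


wants_accessory = query_seeks_accessory("Case for iPhone 4")
wants_product = not query_seeks_accessory("iphone 4 16gb")
```

When the result is True, only listings flagged as product accessories within the returned categories would be retrieved.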

At operation 408, the returned item listings may be ranked or ordered for presentation to the user. By analyzing the words of the search query, search results of a relevant category and classification may be returned to the user. The ordering or ranking of the search results may be based on a multitude of factors, including the score assigned by the classifier and decision tree, the price of the item, the amount of time remaining to purchase the item, and so forth.

FIG. 5 shows a diagrammatic representation of a machine in the example form of a computer system 500 within which a set of instructions may be executed to cause the machine to perform any one or more of the methodologies discussed herein. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The example computer system 500 includes a processor 502 (e.g., a central processing unit (CPU), a graphics processing unit (GPU) or both), a main memory 504 and a static memory 506, which communicate with each other via a bus 508. The computer system 500 may further include a video display unit 510 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 500 also includes an alphanumeric input device 512 (e.g., a keyboard), a user interface (UI) navigation device 514 (e.g., a mouse), a disk drive unit 516, a signal generation device 518 (e.g., a speaker) and a network interface device 520.

The disk drive unit 516 includes a machine-readable medium 522 on which is stored one or more sets of instructions and data structures (e.g., software 524) embodying or utilized by any one or more of the methodologies or functions described herein. The software 524 may also reside, completely or at least partially, within the main memory 504 and/or within the processor 502 during execution thereof by the computer system 500, the main memory 504 and the processor 502 also constituting machine-readable media.

The software 524 may further be transmitted or received over a network 526 via the network interface device 520 utilizing any one of a number of well-known transfer protocols (e.g., HTTP).

While the machine-readable medium 522 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present invention, or that is capable of storing, encoding or carrying data structures utilized by or associated with such a set of instructions. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media.

Certain embodiments are described herein as including logic or a number of components, modules, or mechanisms. Modules may constitute either software modules (e.g., code and/or instructions embodied on a machine-readable medium or in a transmission signal) or hardware modules. A hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computer systems (e.g., the computer system 500) or one or more hardware modules of a computer system (e.g., a processor 502 or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.

In various embodiments, a hardware module may be implemented mechanically or electronically. For example, a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a processor 502 or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.

Accordingly, the term “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired) or temporarily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where the hardware modules comprise a processor 502 configured using software, the processor 502 may be configured as respective different hardware modules at different times. Software may accordingly configure a processor 502, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.

Modules can provide information to, and receive information from, other modules. For example, the described modules may be regarded as being communicatively coupled. Where multiples of such hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the modules. In embodiments in which multiple modules are configured or instantiated at different times, communications between such modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple modules have access. For example, one module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further module may then, at a later time, access the memory device to retrieve and process the stored output. Modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
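The store-then-retrieve pattern described above might be pictured as follows. This is a minimal sketch only: the dictionary `shared_store` stands in for a memory device accessible to both modules, and the function names and the placeholder token-count operation are hypothetical rather than drawn from the description.

```python
# Hypothetical sketch of two modules communicating through a shared memory structure.
shared_store = {}  # stands in for a memory device both modules can access


def scoring_module(listing_id: str, title: str) -> None:
    """First module: performs an operation and stores its output in shared memory."""
    shared_store[listing_id] = len(title.split())  # placeholder operation: token count


def reporting_module(listing_id: str) -> str:
    """Second module: at a later time, retrieves and processes the stored output."""
    token_count = shared_store[listing_id]
    return f"{listing_id}: {token_count} tokens"
```

The two functions never call each other directly; the only coupling between them is the shared structure, which is the point of the passage.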

The various operations of example methods described herein may be performed, at least partially, by one or more processors 502 that are temporarily configured (e.g., by software, code, and/or instructions stored in a machine-readable medium) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors 502 may constitute processor-implemented (or computer-implemented) modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented (or computer-implemented) modules.

Moreover, the methods described herein may be at least partially processor-implemented (or computer-implemented) and/or processor-executable (or computer-executable). For example, at least some of the operations of a method may be performed by one or more processors 502 or processor-implemented (or computer-implemented) modules. Similarly, at least some of the operations of a method may be governed by instructions that are stored in a computer readable storage medium and executed by one or more processors 502 or processor-implemented (or computer-implemented) modules. The performance of certain of the operations may be distributed among the one or more processors 502, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processors 502 may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors 502 may be distributed across a number of locations.

While the embodiment(s) is (are) described with reference to various implementations and exploitations, it will be understood that these embodiments are illustrative and that the scope of the embodiment(s) is not limited to them. In general, techniques for the embodiments described herein may be implemented with facilities consistent with any hardware system or hardware systems defined herein. Many variations, modifications, additions, and improvements are possible.

Plural instances may be provided for components, operations or structures described herein as a single instance. Finally, boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the embodiment(s). In general, structures and functionality presented as separate components in the exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the embodiment(s).

The Abstract of the Disclosure is provided to comply with 37 C.F.R. §1.72(b), requiring an abstract that will allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.

1. (canceled)
2. A system, comprising: a processor-implemented textual mining module configured to parse a data field of a listing and generate at least one element from the data field; a processor-implemented scoring module configured to: calculate a score for the listing based on the at least one element, the listing score representing a probability of the listing being in one of two binary classifications; and a processor-implemented binary classifier module configured to generate an output representing a refined score for the listing based on the listing score and at least one listing attribute value.
3. The system of claim 2, wherein the processor-implemented scoring module uses a naive Bayes classifier to calculate a score of the at least one element.
4. The system of claim 2, wherein the listing is an item listing of an item offered for sale, and wherein the two binary classifications are products and product accessories.
5. The system of claim 2, wherein the processor-implemented scoring module is configured to calculate the element score by determining a first number of occurrences of the at least one element in a first binary classification in a set of item listings and a second number of occurrences of the at least one element in a second binary classification in the set of item listings and by obtaining a ratio of the first number of occurrences to a sum of the first number of occurrences and the second number of occurrences.
6. The system of claim 5, wherein the processor-implemented scoring module is further configured to calculate the element score by normalizing the ratio to derive the element score.
7. The system of claim 5, wherein the processor-implemented scoring module is configured to calculate the listing score by adding together the first number of occurrences for a plurality of the at least one element and dividing the added first number of occurrences for the at least one element by the sum of the first number of occurrences and the second number of occurrences.
8. The system of claim 4, wherein the at least one listing attribute value includes at least one of quantity of the item, price of the item, median price of the item, product sales rank, shipping cost, return policy, seller feedback score, and item location.
9. A computer-implemented method, comprising: parsing, by at least one processor, a data field of a listing and generating at least one element from the data field; calculating, by the at least one processor, a score for the listing based on the at least one element, the listing score representing a probability of the listing being in one of two binary classifications; inputting the listing score and one or more listing attribute values into a binary classifier; and generating an output using the binary classifier, the output representing a refined score for the listing based on the listing score and at least one of the listing attribute values.
10. The computer-implemented method of claim 9, wherein the listing is an item listing of an item offered for sale, and wherein the two binary classifications are products and product accessories.
11. The computer-implemented method of claim 9, wherein a calculating of at least one element score uses a naïve Bayes classifier and comprises: determining a first number of occurrences of the at least one element in a first binary classification in a set of item listings and a second number of occurrences of the at least one element in a second binary classification in the set of item listings; and calculating a ratio of the first number of occurrences to a sum of the first number of occurrences and the second number of occurrences.
12. The computer-implemented method of claim 11, wherein the calculating of the at least one element score further comprises normalizing the ratio to derive the at least one element score.
13. The computer-implemented method of claim 11, wherein the calculating of the listing score comprises: adding together the first number of occurrences for a plurality of the at least one element; and dividing the added first number of occurrences for the at least one element by the sum of the first number of occurrences and the second number of occurrences.
14. The computer-implemented method of claim 9, wherein the listing attribute values include at least one of quantity of the item, price of the item, median price of the item, product sales rank, shipping cost, return policy, seller feedback score, and item location.
15. A non-transitory machine-readable storage medium storing a set of instructions that, when executed by at least one processor, causes the at least one processor to perform operations comprising: parsing a data field of a listing and generating at least one element from the data field; calculating a score for the listing based on the at least one element, the listing score representing a probability of the listing being in one of two binary classifications; inputting the listing score and one or more listing attribute values into a binary classifier; and generating an output using the binary classifier, the output representing a refined score for the listing based on the listing score and at least one of the listing attribute values.
16. The machine-readable storage medium of claim 15, wherein the listing is an item listing of an item offered for sale, and wherein the two binary classifications are products and product accessories.
17. The machine-readable storage medium of claim 15, wherein the calculating of the at least one element score comprises: determining a first number of occurrences of the at least one element in a first binary classification in a set of item listings and a second number of occurrences of the at least one element in a second binary classification in the set of item listings; and calculating a ratio of the first number of occurrences to a sum of the first number of occurrences and the second number of occurrences.
18. The machine-readable storage medium of claim 17, wherein the calculating of the at least one element score further comprises normalizing the ratio to derive the at least one element score.
19. The machine-readable storage medium of claim 17, wherein a calculating of at least one element score uses a naïve Bayes classifier and comprises: adding together the first number of occurrences for a plurality of the at least one element; and dividing the added first number of occurrences for the at least one element by the sum of the first number of occurrences and the second number of occurrences.
20. The machine-readable storage medium of claim 15, wherein the listing attribute values include at least one of quantity of the item, price of the item, median price of the item, product sales rank, shipping cost, return policy, seller feedback score, and item location.